ABSTRACT

We calculate photometric redshifts from the Sloan Digital Sky Survey Main 
Galaxy Sample, The Galaxy Evolution Explorer All Sky Survey, and The Two 
Micron All Sky Survey using two new training-set methods. We utilize the 
broadband photometry from the three surveys alongside Sloan Digital Sky Survey 
measures of photometric quality and galaxy morphology. Our first training-set
method draws from the theory of ensemble learning while the second employs
Gaussian process regression, both of which allow for the estimation of redshift
along with a measure of uncertainty in the estimation. The Gaussian process
models the data very effectively with small training samples of approximately
1000 points or less. These two methods are compared to a well-known Artificial
Neural Network training-set method and to simple linear and quadratic regres- 
sion. Our results show that robust photometric redshift errors as low as 0.02 
RMS can regularly be obtained. We also demonstrate the need to provide confi- 
dence bands on the error estimation made by both classes of models. Our results 
indicate that variations due to the optimization procedure used for almost all 
neural networks, combined with the variations due to the data sample, can pro- 
duce models with variations in accuracy that span an order of magnitude. A key 
contribution of this paper is to quantify the variability in the quality of results 
as a function of model and training sample. We show how simply choosing the 
"best" model given a data set and model class can produce misleading results. 



Subject headings: Photometric Redshifts, Sloan Digital Sky Survey, Galaxy Evo- 
lution Explorer All Sky Survey, Two Micron All Sky Survey 
1. INTRODUCTION

Using broadband photometry in multiple filters to estimate redshifts of galaxies was 
likely first attempted by Baum (1962) on 25 galaxies in nine broadband imaging filters in 
the visible and near-infrared range. Given the low throughput of spectrographs, much is
to be gained by attempting to estimate galaxy redshifts from broadband colors rather than 
from measurement of individual spectra. In the Sloan Digital Sky Survey (SDSS, York et 
al. 2000) 100 million galaxies will have accurate broadband u,g,r,i,z photometry, but only 
1 million galaxy redshifts from this sample will be measured. If a method can be found to 
obtain an accurate estimate of the redshift for the larger SDSS photometric catalog, rather 
than the smaller spectroscopic one, much better constraints on the formation and evolution of 
large-scale structural elements such as galaxy clusters, filaments, and walls and cosmological 
models in general (e.g. Blake & Bridle 2005) may be achieved. 

Two approaches, spectral energy distribution fitting (SED fitting: also known as "template- 
fitting") and the training-set method (TS method), have been used to obtain photometric 
redshifts over the past 30 years. To use TS methods, galaxies with a similar range
in magnitude and color over the same possible redshift range must be used to estimate
the redshifts from the broadband colors measured. Since this type of data has not always
been available, SED fitting has historically been the preferred method (e.g. Koo 1985; Loh
& Spillar 1986; Lanzetta et al. 1996; Kodama et al. 1999; Benitez 2000; Massarotti et al.
2001; Babbedge et al. 2004; Padmanabhan et al. 2005) given the historically low numbers
of galaxies with spectroscopically confirmed redshifts in deep photometric surveys of the
universe. This is because photometric surveys have always gone, and continue to
go, deeper than is possible with spectroscopy. Another alternative has been to use training
sets consisting of a combination of both observed galaxy templates and those from galaxy 
evolution models (e.g. hyperz, Bolzonella et al. 2000). 

There are many approaches to SED fitting. For example, Kodama et al. (1999) use
four-filter (BVRI) photometry and a Bayesian classifier using SED fitting, which they
have tested out to z=1 and claim is valid beyond this redshift. The approach of Benitez
(2000) makes use of additional information such as the shape of the redshift distributions 
and fractions of different galaxy types. This may be helpful in instances where one has a 
limited sample size at large redshifts. However, all estimators, Bayesian or otherwise, can 
be biased due to small sample size effects. 

TS methods rely on having a complete sample of galaxies in magnitude, color and 
redshift. Hence these methods have been restricted to relatively nearby z<1 surveys, such
as the SDSS, rather than much deeper surveys such as the Hubble Deep Field (Williams et
al. 1996). In fact, for redshifts above 1 there have not been sufficiently large and complete
enough measured samples of galaxy redshifts, magnitudes and colors to use TS methods
with much accuracy (e.g. Wang et al. 1998). As well, the colors of galaxies change nearly
monotonically up to z=1, but beyond this the color-redshift space becomes much more
complex and simple linear and quadratic regression will fail. Hence SED fitting has been
used almost exclusively for surveys of z>1. See Benitez (2000) for an excellent detailed
discussion of the differences and similarities between these two commonly used approaches. 

In the past 10 years a large number of empirical fitting techniques for TS methods have 
come into use and new techniques continue to be developed. Some examples of linear and 
non-linear methods include: 2nd and 3rd order polynomial fitting (Brunner et al. 1997; Wang 
et al. 1998; Budavari 2005); quadratic polynomial fitting (Hsieh et al. 2005; Connolly et al. 
1995); support vector machines (Wadadekar 2005); nearest neighbor and kd-trees (Csabai 
et al. 2003), and artificial neural networks (Firth et al. 2003; Tagliaferri et al. 2003; Ball et 
al. 2004; Collister & Lahav 2004; Vanzella et al. 2004). 

We explore the problem of estimating redshifts from broadband photometric measure- 
ments using the idea of a virtual sensor (Srivastava 2005; Srivastava & Stroeve 2003). These 
methods allow for the estimation of unmeasured spectral phenomena based on learning the 
potentially nonlinear correlations between observed sets of spectral measurements. In the 
case of estimating redshifts, we can learn the nonlinear correlation between spectroscopically 
measured redshifts and broadband colors. Statistically speaking, this amounts to building a 
regression model to estimate the photometric redshift. However, the procedure is much more 
complex than a simple regression due to the significant effort required for model building 
and validation. The concept of virtual sensors applies to the entire chain of analytical steps 
leading up to the prediction of the redshift. Figure 1 shows a schematic of the assumptions 
behind a Virtual Sensor with a cartoon on the left and the real-world case with the five 
SDSS bandpasses and a sample galaxy spectrum overlaid on the right. 

As a baseline comparison, results from a TS-based neural network package called ANNz 
(Collister & Lahav 2004) are presented. Linear and quadratic fits along the lines discussed 
in Connolly et al. (1995) are also presented. Unlike all other previous work, we also discuss 
the application of bootstrap resampling (Efron 1979; Efron & Tibshirani 1993) for the linear, 
quadratic, and ANNz models. 

We apply the TS methods discussed above to the SDSS five-color (ugriz) imaging survey 
known as the Main Galaxy Sample (MGS, Strauss et al. 2002) which has a large calibration 
set of spectroscopic redshifts for the SDSS Data Release 2 (DR2, Abazajian et al. 2004) 
and SDSS Data Release 3 (DR3, Abazajian et al. 2005). The Two Micron All Sky Survey 
(2MASS, Skrutskie et al. 2006; see footnote 1) extended source catalog along with Galaxy Evolution
Explorer (GALEX, Martin et al. 2005; see footnote 2) data are also used in conjunction with the SDSS where
all three overlap to create a combined catalog for use with our TS methods.

The data sets used in our analysis are discussed in § 2, discussion of the photometry 
and spectroscopic quality of the data sets along with other photometric pipeline output 
properties of interest is given in § 3, the classification schemes used to obtain photometric 
redshifts are in § 4, comparison of the results takes place in § 5, and we summarize in § 6. 

2. THE SLOAN DIGITAL SKY SURVEY, THE TWO MICRON ALL SKY
SURVEY AND THE GALAXY EVOLUTION EXPLORER DATA SETS

Most of the work herein is related to the SDSS MGS DR2 and DR3, and the photomet- 
ric quantities associated with them. For completeness we have added the 2MASS extended 
source catalog and GALEX All Sky Survey photometric attributes where data exists for the 
same SDSS MGS galaxies with corresponding redshifts. The 2MASS and GALEX data sam- 
ples are small where they overlap with those of the SDSS MGS galaxies with corresponding 
known spectroscopic redshifts in the DR2 and DR3. However, they appear copious enough 
for our new TS methods as there is no evidence of over-fitting of these smaller data samples. 

The Sloan Digital Sky Survey (York et al. 2000) will eventually encompass roughly 1/4
of the entire sky, collecting five-band photometric data in 7700 deg^2 down to 23rd magnitude
in r for of order 10^8 celestial objects. For about 1 in every 100 of these objects down to g~20, a
spectrum will be measured, coming to a total of about 10^6 galaxy and quasar redshifts over
roughly the same area of the sky (7000 deg^2) as the photometric survey (Stoughton et al.
2002). The five broadband filters used, u, g, r, i and z, cover the optical range of the spectrum
(Table 1).

We use several catalogs derived from the SDSS. The MGS (Strauss et al. 2002) of the 
SDSS is a magnitude-limited survey that targets all galaxies down to r_Petrosian < 17.77. We
use the MGS from DR2 and DR3 where spectroscopic redshifts exist in order to validate our 
methods. 

The 2MASS extended source catalog contains positions and magnitudes in j, h, and k_s
filters for 1,647,599 galaxies and other nebulae across the entire sky (Table 1). The extended
source magnitude limits in the three filters are j=15.0, h=14.3 and k_s=13.5. See Jarrett et
al. (2000) for more detailed information on the extended source catalog.

1 http://www.ipac.caltech.edu/2mass/
2 http://www.galex.caltech.edu/

The GALEX data release 1 (GR1; see footnote 3) All Sky photometry catalog contains positions and
magnitudes in two ultraviolet bands called the far ultraviolet band (FUV) and the near
ultraviolet band (NUV). See Table 1 for details on these bandpass filters. The limiting
magnitudes for the all-sky survey (100 s integrations) are 19.9 in the FUV and 20.8 in the NUV. See
Morrissey et al. (2005) and references therein for more details of the in-orbit instrument
performance and Martin et al. (2005) for mission details. The all-sky GR1 covers 2792 deg^2
of the sky.

3. PHOTOMETRIC AND REDSHIFT QUALITY, MORPHOLOGICAL 
INDICATORS, AND OTHER CATALOG PROPERTIES 

Historically most determinations of photometric redshifts from large photometric sur- 
veys contain only broadband magnitudes without reference to other parameters that may 
have been available from the photometric aperture reductions themselves. With the SDSS 
most papers have utilized only the five band photometry (ugriz) while a host of additional 
parameters like Petrosian radii (Strauss et al. 2002), measures of ellipticity (Stoughton et 
al. 2002), and other derived quantities are readily available from the photometric pipeline 
reductions. 

This section explains the various quality flags used to obtain data from the SDSS photometric
and redshift catalogs, the photometric catalogs of the 2MASS extended source
catalog, and the GALEX All Sky Survey. We also explore the morphological indicators most
likely to yield information related to the prediction of redshifts in the SDSS MGS for our
TS calculations. The last subsection (§ 3.6) describes the four data set types used in our
analysis.

3.1. The SDSS photometric quality flags 

The SDSS photometric pipeline (Lupton et al. 2001) produces a host of quality flags
(Stoughton et al. 2002, Table 9) giving additional information on how the photometry was
estimated. The primtarget flag is used to make sure the MGS is chosen, and
extinction-corrected model magnitudes (Stoughton et al. 2002) are used throughout this work (see
query in Appendix I).

3 http://galex.stsci.edu/GR1/

Herein we define GOOD and GREAT quality photometry (see Table 2 for a description),
where ! means NOT:

GOOD: !BRIGHT and !BLENDED and !SATURATED

GREAT: GOOD and !CHILD and !COSMICRAY and !INTERP

In this manner one can determine whether a difference in the quality of the photometry
makes any difference in the errors of the estimated photometric redshifts. The only reason
not to always use the very best photometry (what we call GREAT in this work) is that
the total number of galaxies can drop by orders of magnitude and hence one may end up
sampling a much smaller number of objects. However, not everyone's needs are the same
and hence the quality can be weighted based on what is desirable. See Appendix I for the
complete SDSS skyserver (see footnote 4) queries used to obtain the data used in this paper.
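
The GOOD and GREAT definitions above amount to bitmask tests on the photometric flag word. The following is a minimal sketch, not the query actually used (that is given in Appendix I): the bitmask values are those listed in Table 2, while the function names and the assumption that the flags arrive as a single Python integer are ours.

```python
# Bitmask values as listed in Table 2 (Stoughton et al. 2002).
BRIGHT = 0x00002
BLENDED = 0x00008
CHILD = 0x00010
COSMICRAY = 0x01000
INTERP = 0x20000
SATURATED = 0x40000

# Any of these bits being set disqualifies an object.
GOOD_REJECT = BRIGHT | BLENDED | SATURATED
GREAT_REJECT = GOOD_REJECT | CHILD | COSMICRAY | INTERP


def is_good(flags: int) -> bool:
    """GOOD: !BRIGHT and !BLENDED and !SATURATED."""
    return (flags & GOOD_REJECT) == 0


def is_great(flags: int) -> bool:
    """GREAT: GOOD and !CHILD and !COSMICRAY and !INTERP."""
    return (flags & GREAT_REJECT) == 0
```

For example, an object whose only set bit is INTERP would pass the GOOD test but fail the GREAT test.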



3.2. The SDSS redshift quality flags 

The SDSS spectroscopic survey (Stoughton et al. 2002; Newman et al. 2004) has several 
flags to warn the user of poor-quality redshifts that come from the spectroscopic pipeline 
reductions (Stoughton et al. 2002). This is important because an inaccurate training set will 
result in poor results no matter which method is used. To this end we utilized an estimate 
of the confidence of the spectroscopic redshift called zConf. Hence only those galaxies with 
zConf>0.95 in the MGS are chosen. Other authors (e.g. Wadadekar 2005) have chosen to 
use only the zWarning flag set to zero. Our studies find zConf values far below that of 0.95 
when only the zWarning=0 flag is set. This may put into question the reliability of such 
redshift estimates. In addition, by setting zConf to values greater than 0.95, as we have 
done, the zWarning=0 flag is also included. Extensive color-color, color-magnitude and 
magnitude error plots were checked against galaxies with values of zConf<=0.95 and those 
with zConf>0.95. No clustering was found in any of these plots related to zConf values and 
hence no color or magnitude bias is introduced by the exclusion of zConf<=0.95 data. 



4 http://casjobs.sdss.org
Table 1: Survey filters and characteristics

Bandpass   Survey   λ_eff (Å)   Δλ (Å)   FWHM (") [1]
FUV        GALEX       1528       442     4.5
NUV        GALEX       2271      1060     6.0
u          SDSS        3551       600     1-2
g          SDSS        4686      1400     1-2
r          SDSS        6165      1400     1-2
i          SDSS        7481      1500     1-2
z          SDSS        8931      1200     1-2
j          2MASS      12500      1620     2-3
h          2MASS      16500      2510     2-3
k_s        2MASS      21700      2620     2-3

[1] The Full Width at Half Maximum is dependent on the seeing at the time of the observation for
ground-based data.



Table 2: Photometric Quality Flags used in this paper

Name        Bitmask   Description
BRIGHT      0x00002   Object detected in first bright object finding step; generally brighter than r=17.5
BLENDED     0x00008   Object had multiple peaks detected within it
SATURATED   0x40000   Object contains one or more saturated pixels
CHILD       0x00010   Object product of attempt to deblend BLENDED object
COSMICRAY   0x01000   Contains pixel interpreted to be part of a cosmic ray
INTERP      0x20000   Object contains pixel(s) values determined by interpolation

a Stoughton et al. (2002)
3.3. 2MASS photometric quality and cross-reference with the SDSS

Given the high quality constraints of the published photometry of the 2MASS extended
source public release catalog (Jarrett et al. 2000), only one quality flag is checked. The
extended source catalog confusion flag, "cc_flg", is required to be zero in all three
bandpasses.

The j, h, and k_s 2MASS magnitudes used in this work are the isophotal fiducial elliptical
aperture magnitudes j_m_k20fe, h_m_k20fe, and k_m_k20fe, as defined in the 2MASS database.

The extended source catalog was loaded into our local SQL database containing the 
SDSS DR2 to create a combined catalog (see next section). 

3.4. GALEX photometric quality and cross-reference with the SDSS 

Near-ultraviolet (nuv) and far-ultraviolet (fuv) broadband photometry are extracted
from the GALEX database for our use. Several quality flags are used to make sure the
data are of the highest quality. Bad photometry values in nuv photometry (nuv_mag) and
fuv photometry (fuv_mag) are given the value of -99 in the GR1 database, and objects are
excluded from our catalog if either or both filters contain such a value. We require
nuv_artifact=0 to avoid all objects with known bad photometry artifacts: if nuv_artifact has
any value other than zero, the nuv_mag is considered bad. Currently fuv_artifact is always
zero in the GR1. We require band=3, which indicates detection in both the nuv and fuv
bands. Finally, fov_radius<0.55 is required, as recommended, so that the distance of the
object from the center of the field of view of the telescope (in degrees) is not too large;
large offsets are known to degrade the quality of the photometry obtained.
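
The GALEX selection criteria of this subsection can be summarized as a single predicate. This is an illustrative sketch only: the column names (nuv_mag, fuv_mag, nuv_artifact, band, fov_radius) are those quoted in the text, but the representation of a catalog row as a Python mapping and the function name are assumptions made for the example.

```python
from typing import Mapping

BAD_MAG = -99.0         # GR1 sentinel value for bad photometry
MAX_FOV_RADIUS = 0.55   # degrees from the center of the field of view


def passes_galex_cuts(row: Mapping[str, float]) -> bool:
    """Return True if a GR1 catalog row satisfies every cut described above."""
    return (
        row["nuv_mag"] != BAD_MAG            # valid NUV magnitude
        and row["fuv_mag"] != BAD_MAG        # valid FUV magnitude
        and row["nuv_artifact"] == 0         # no known NUV artifact
        and row["band"] == 3                 # detected in both NUV and FUV
        and row["fov_radius"] < MAX_FOV_RADIUS
    )
```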

As with the 2MASS extended source catalog, the GALEX All Sky Survey data were 
loaded into our local SQL database now containing the SDSS DR2 and 2MASS catalogs. The 
SDSS MGS with redshifts and the 2MASS extended source catalogs were cross-referenced 
with GALEX when all three catalog positions agreed to within 5". The methods and results 
used are comparable to those of Seibert et al. (2005): hence we do not go further into a 
description of the combined catalog. See Appendix I for a sample query. 
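
For readers who wish to reproduce a positional match of this kind outside a SQL database, the sketch below tests whether three catalog positions agree to within 5". It is not the query used here (see Appendix I). We assume, for illustration, that "agree" means every pair of positions is within the tolerance, and that positions are (RA, Dec) pairs in degrees; the function names are hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees, output in arcsec."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, a)))) * 3600.0


def positions_agree(sdss, twomass, galex, tol_arcsec=5.0):
    """True if every pair among the three (ra, dec) positions lies within tol_arcsec."""
    pairs = ((sdss, twomass), (sdss, galex), (twomass, galex))
    return all(angular_separation_arcsec(*p, *q) <= tol_arcsec for p, q in pairs)
```

A brute-force pairwise test like this scales poorly to full catalogs; an indexed spatial join in the database is the practical choice for large samples.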
3.5. SDSS Petrosian Radii, Inverse Concentration Index, FracDev, and Stokes

The photometry properties discussed below are available in all five SDSS bandpasses 
(ugriz), but we use the r-bandpass values for these quantities as, in general, the r-band 
result has the lowest error and gives more consistent results. This is also reasonable given 
the low redshifts used, but this strategy would be questionable at higher redshifts when 
morphological features in the rest frame r band start to get more strongly shifted to the i 
and z bands. 

It has been shown (e.g. Wadadekar 2005) that using the Petrosian (1976) 50% and 90% flux radii
in addition to the SDSS five-band photometry can improve results by as much as
15% (see Table 3). The Petrosian 50% (90%) radius is the radius within which 50% (90%) of the
flux of the object is contained. Given the low redshifts of this catalog they can be assumed
to be a rough measure of the angular size of the object. The ratio of these quantities is
called the Petrosian inverse concentration index (CI), 1/c = r_50/r_90, which measures the slope
of the light profile. The concentration index corresponds nicely to eyeball morphological
classifications of large nearby galaxies (Strateva 2001; Shimasaku et al. 2001).

The Petrosian Radii are also used in combination with a measure of the profile type 
from the SDSS photometric pipeline reduction called FracDev. FracDev comes from a linear 
combination of the best exponential and de Vaucouleurs profiles that are fit to the image in 
each band. FracDev is the de Vaucouleurs term (§3.1, Abazajian et al. 2004). It is 1 for a pure 
de Vaucouleurs profile typical of early-type galaxies and zero for a pure exponential profile 
typical of late-type galaxies. FracDev is represented as a floating point number between
zero and 1. This is similar to the use of the Sersic n-index (Sersic 1968) for morphological
classification. The idea of using FracDev as a proxy for the Sersic index n comes from
Vincent & Ryden (2004) who show that if Sersic profiles with 1<n<4 accurately describe
the SDSS galaxy early and late types then FracDev is a "monotonically increasing function
of the Sersic index n, and thus can be used as a surrogate for n." For a recent discussion
on Sersic profiles see Graham & Driver (2005). Blanton et al. (2003a,b) have also shown
that Sersic fits to the azimuthally averaged radial profile of an SDSS object provide a better
estimate of galaxy morphology than the Petrosian inverse concentration index (1/c = r_50/r_90)
for the majority of MGS objects. However, at the time of this work these profiles were only 
available in the derived SDSS DR2 NYU-VAGC catalog of Blanton et al. (2005), and our 
own studies do not show appreciable improvement over the Petrosian inverse concentration 
index when used to calculate photometric redshifts. 

Measures of galaxy ellipticity and orientation, as projected on the sky, can be obtained 
from the SDSS photometric pipeline "Stokes" parameters Q and U (Stoughton et al. 2002). 
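These are the flux-weighted second moments of a particular isophote.

    M_{xx} = \langle x^2 / r^2 \rangle, \quad M_{yy} = \langle y^2 / r^2 \rangle, \quad M_{xy} = \langle xy / r^2 \rangle    (1)

According to Stoughton et al. (2002), when the isophotes are self-similar ellipses one finds

    Q = M_{xx} - M_{yy} = \frac{a-b}{a+b} \cos(2\phi), \quad U = M_{xy} = \frac{a-b}{a+b} \sin(2\phi)    (2)

where a and b are the semi-major and semi-minor axes and \phi is the position angle.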

Since the Stokes values are related to the axis ratio and position angle, using these
quantities in combination with those above should give additional information on the galaxy
types we are sampling and hence help in the estimation of photometric redshifts. However,
in our studies we only utilize the Q parameter defined above as we see no improvement when
using both Q and U.
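
To make the link between the Stokes parameters and the axis ratio and position angle concrete, the sketch below assumes the SDSS convention for self-similar elliptical isophotes, Q = [(a-b)/(a+b)] cos(2φ) and U = [(a-b)/(a+b)] sin(2φ), with a and b the semi-major and semi-minor axes and φ the position angle. The function names are ours and are for illustration only.

```python
import math


def stokes_from_ellipse(a, b, phi_deg):
    """Q and U for self-similar elliptical isophotes with axes a >= b > 0."""
    if b <= 0 or a < b:
        raise ValueError("require a >= b > 0")
    ellipticity = (a - b) / (a + b)
    phi = math.radians(phi_deg)
    return ellipticity * math.cos(2.0 * phi), ellipticity * math.sin(2.0 * phi)


def ellipse_from_stokes(q, u):
    """Invert the relation: return the axis ratio b/a and position angle in degrees."""
    ellipticity = math.hypot(q, u)
    axis_ratio = (1.0 - ellipticity) / (1.0 + ellipticity)
    phi_deg = 0.5 * math.degrees(math.atan2(u, q))
    return axis_ratio, phi_deg
```

Under this convention Q alone mixes ellipticity and orientation, so a round galaxy and an elongated one at φ = 45° both give Q = 0.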



3.6. Description of the four data set types used 

Four classes of data sets are used in our analysis, based on the descriptions above. 

Data set 1: SDSS MGS GOOD quality photometry. All of the data come from the SDSS 
MGS with the GOOD quality flags set. There are six subsets in this data set as seen in 
Figure 3. 

1. u-g-r-i-z: contains only the SDSS five-band extinction corrected magnitudes. 

2. u-g-r-i-z-petro50-petro90: contains the u-g-r-i-z data and the Petrosian 50% and 90% 
radii in the r band. 

3. u-g-r-i-z-petro50-petro90-ci: contains the u-g-r-i-z-petro50-petro90 data and the Pet- 
rosian concentration index as described in § 3.5. 

4. u-g-r-i-z-petro50-petro90-ci-qr: contains the u-g-r-i-z-petro50-petro90-ci and the Stokes 
Q parameter as described in § 3.5. 

5. u-g-r-i-z-petro50-petro90-fracdev: contains the u-g-r-i-z-petro50-petro90 and the FracDev 
parameter as described in § 3.5. 

6. u-g-r-i-z-petro50-petro90-qr-fracdev: contains the u-g-r-i-z-petro50-petro90-fracdev data and
the Stokes Q parameter as described in § 3.5.

Each subset consists of 202,297 galaxies.

Data set 2: SDSS MGS GREAT quality photometry. All of the data, as seen in Figure 4, 
come from the SDSS MGS with the GREAT quality flags set. There are six subsets named 
and described in the same way as for data set 1. Each subset consists of 33,328 galaxies. 

Data set 3: GALEX GR1, SDSS MGS GOOD quality photometry, and the 2MASS extended 
source catalogs labeled as nuv-fuv-ugriz-jhk. As seen in Figure 5a, it consists of the two 
ultraviolet magnitudes from the GALEX GR1 database (nuv and fuv). It has the five SDSS 
MGS extinction-corrected magnitudes (u,g,r,i,z) with the GOOD quality photometry flags 
set, but unlike data sets 1 and 2 there are no other SDSS inputs used. It also contains the 
three 2MASS extended source catalog magnitudes (j,h,k_s). The total data set consists of
3095 galaxies. 

Data set 4: GALEX GR1, SDSS MGS GREAT quality photometry, and the 2MASS extended 
source catalogs. As shown in Figure 5b it is nearly the same as data set 3, except the SDSS 
MGS GREAT quality photometry flags are set. The total data set consists of 326 galaxies. 

4. TRAINING METHODS 

We estimate the photometric redshifts of the galaxies in the SDSS, 2MASS and GALEX 
databases using several classes of algorithms: simple linear and quadratic regression, neu- 
ral networks, and Gaussian processes. These methods have different properties and make 
different assumptions about the underlying data generating process that will be discussed 
below. 



4.1. Linear and Quadratic fits 

Linear and quadratic polynomial fitting along the lines of Connolly et al. (1995); Hsieh 
et al. (2005) are used as a way to benchmark the new methods discussed below. The linear 
regression for the SDSS ugriz magnitudes would be given by an equation of the form: 

Z = A + Bu + Cg + Dr + Ei + Fz (3) 

where A, B, C, D, E, and F result from the fit. All data points are weighted equally.
Z is the redshift: the spectroscopic one when training and the photometric one when testing.

Fig. 1. The left figure is a cartoon to help illustrate the need for a Virtual Sensor. We have
spectral measurements from two sensors S_1 and S_2 (solid and dot-dashed lines, respectively).
We wish to estimate the output of sensor S_1 for a wavelength where there is no actual
measurement from the sensor. Note that some sensor measurements overlap perfectly, as
in the case of wavelength = 3, and in other cases, such as wavelength = 1, there is some
overlap in the measurements. The right figure shows the sensitivity through an airmass of
1.3 for extended sources in the five SDSS (u,g,r,i,z) filter bandpasses with the spectrum of
NGC5102 (Storchi-Bergmann et al. 1995), purposely redshifted by 1000 Å, overlaid.

The quadratic form is similar and again all points are weighted equally:

    Z = A + Bu + Cg + Dr + Ei + Fz + Guu + Hgg + Irr + Jii + Kzz
        + Lug + Mur + Nui + Ouz + Pgr + Qgi + Rgz + Sri + Trz + Uiz    (4)
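
The linear and quadratic fits of equations (3) and (4) are ordinary least-squares problems with equally weighted points. A minimal sketch using NumPy follows; it is not the code used for the results in this paper, and the function names are hypothetical. For five magnitudes the linear model has 6 coefficients and the quadratic model 21, matching A through F and A through U above, although the column ordering of the quadratic terms here differs from that of equation (4).

```python
from itertools import combinations_with_replacement

import numpy as np


def design_matrix(mags, degree=1):
    """Columns: a constant, each magnitude, and (degree 2) all pairwise products."""
    mags = np.asarray(mags, dtype=float)
    n, p = mags.shape
    cols = [np.ones(n)] + [mags[:, j] for j in range(p)]
    if degree == 2:
        cols += [mags[:, i] * mags[:, j]
                 for i, j in combinations_with_replacement(range(p), 2)]
    return np.column_stack(cols)


def fit(mags, z_spec, degree=1):
    """Equally weighted least-squares fit of spectroscopic redshift on magnitudes."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(mags, degree),
                                 np.asarray(z_spec, dtype=float), rcond=None)
    return coeffs


def predict(coeffs, mags, degree=1):
    """Photometric redshift estimate for new magnitudes."""
    return design_matrix(mags, degree) @ coeffs
```

In use, one would call fit on the training galaxies with known spectroscopic redshifts and predict on the held-out test galaxies.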

4.2. The Artificial Neural Network approach 

The artificial neural network (ANNz) approach of Collister & Lahav (2004) is specifically
designed to calculate photometric redshifts from any galaxy properties the user deems
desirable. It has been demonstrated to work remarkably well on the SDSS DR1 (Collister
& Lahav 2004). The ANNz package contains code to run back-propagation neural networks
with arbitrary numbers of hidden units, layers and transfer functions. We chose two hidden
layers with 10 nodes in each layer (see Figure 2). See the next section for a more
detailed description of neural networks in general, or see Collister & Lahav (2004).
4.3. The Ensemble Model



Back-propagation neural networks have been used extensively in a variety of applications
since their inception. A good summary of the methods we use can be found in Bishop
(1995). Neural networks are a form of nonlinear regression in which a mapping, defined as
a linear combination of nonlinear functions of the inputs, is used to approximate a desired
target value. The weights of the linear combination are usually set by gradient descent
on a cost function that is defined between the target value and the estimated
value. The cost function usually has multiple local minima, and the model obtained
at the end of a training cycle usually corresponds to one such minimum and not to a global
minimum. The global minimum would correspond to the model that best approximates the
training set. Generalization of the model on a test set (i.e., data that is not used during the
model building phase) can be shown to be poor if a global minimum is reached, due to the
phenomenon of over-fitting.

The following material is a standard demonstration that although the neural network
computes a nonlinear function of the inputs, the distribution of errors follows a Gaussian if the
squared error cost function is minimized. The cost function encodes an underlying model of
the distribution of errors. For example, suppose we are given a data set of inputs X, targets
y, and a model parameterized by Θ. The standard method of obtaining the parameter Θ
is by maximizing the likelihood of observing the data D = (X, y) with the model Θ. Thus,
we need to maximize:

    P(\Theta | D) = \frac{P(D | \Theta) P(\Theta)}{P(D)} \propto P(D | \Theta) P(\Theta)

and we note that P(D|Θ) = P(X, y|Θ) and so:

    P(X, y | \Theta) = P(y | X, \Theta) P(X | \Theta)    (5)

The function P(Θ) represents the prior distribution over model parameters. If we have
knowledge about the ways in which the weights of the model are distributed before the data
arrives, such information can be encoded in the prior. Neal (1996) has shown that in the
limit of an infinitely large network, certain simple assumptions on the distribution of the
initial weights make a neural network converge to a Gaussian process. If we assume that
the errors are normally distributed, we can write the likelihood of an input pattern x_i ∈ X
having target y_i ∈ y with variance σ^2 (see footnote 5) as

    L(y_i | x_i, \Theta) = P(y_i | x_i, \Theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - \hat{y}_i)^2}{2\sigma^2} \right)

where \hat{y}_i is the model estimate of y_i. The product of these likelihoods across the N data
points in the data set D is the likelihood of the entire data set:

    P(y | X, \Theta) = \prod_{i=1}^{N} P(y_i | x_i, \Theta) \propto \prod_{i=1}^{N} \exp\left( -\frac{(y_i - \hat{y}_i)^2}{2\sigma^2} \right)    (6)



From this equation, it is straightforward to see that maximizing the log of this likelihood 
function is equivalent to minimizing the squared error, which is the standard cost function 
for feed-forward neural networks used in regression problems. 
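
The intermediate step can be written out explicitly. Assuming, as above, independent Gaussian errors with a common variance σ^2 that does not depend on Θ, and writing \hat{y}_i for the model estimate of y_i, the logarithm of the likelihood in equation (6) is:

```latex
\begin{align*}
\ln P(\mathbf{y} \mid X, \Theta)
  &= \sum_{i=1}^{N} \ln\!\left[ \frac{1}{\sqrt{2\pi\sigma^{2}}}
     \exp\!\left( -\frac{(y_i - \hat{y}_i)^{2}}{2\sigma^{2}} \right) \right] \\
  &= -\frac{N}{2} \ln\!\left( 2\pi\sigma^{2} \right)
     - \frac{1}{2\sigma^{2}} \sum_{i=1}^{N} (y_i - \hat{y}_i)^{2} .
\end{align*}
```

The first term is constant with respect to Θ, so the Θ that maximizes the log-likelihood is the one that minimizes the sum of squared residuals.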

Neural networks are often depicted as a directed graph consisting of nodes and arcs as
shown in Figure 2. For a p-dimensional input x, the value at the k hidden nodes z is the
k x 1 vector:

    z = s(W_1 x + b_1)    (7)

and the final estimate of the target y is given by \hat{y}:

    \hat{y} = W_2 z + b_2 = f(x, \Theta)    (8)

where W_1 is a k x p matrix, b_1 is a k x 1 vector, W_2 is an l x k matrix and b_2 is an l x 1
vector. In the case where the network only generates one output per input pattern, as is the
case in the studies presented here, l = 1.

The function s is a nonlinear function and is chosen as a sigmoid:

    s(a) = \frac{1}{1 + \exp(-a)}    (9)
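
Equations (7) to (9) describe a single forward pass through a one-hidden-layer network. The sketch below illustrates this with NumPy; it is not the ANNz implementation. We take W_1 as k x p, b_1 as k x 1, W_2 as l x k and b_2 as l x 1 so that the matrix products are conformable, and the function names are ours.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid s(a) = 1 / (1 + exp(-a)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-a))


def forward(x, W1, b1, W2, b2):
    """Forward pass: hidden activations z = s(W1 x + b1), output y_hat = W2 z + b2."""
    z = sigmoid(W1 @ x + b1)   # shape (k,)
    return W2 @ z + b2         # shape (l,)
```

With p = 5 inputs (for example the ugriz magnitudes), k = 10 hidden nodes and l = 1 output, the parameter set Θ = (W_1, b_1, W_2, b_2) contains 50 + 10 + 10 + 1 = 71 weights.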

Neural networks are trained to fit data by maximizing the likelihood of the data given 
the parameters. The model obtained through this maximization process corresponds to a 



5 We follow the convention that bold-faced notation indicates vectors and non-bold faced symbols indicate 
scalars 
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Fig. 2. — A graphical depiction of a neural network with 4 inputs, 4 hidden units, and 3 
outputs. The outputs are nonlinear functions of the inputs. 

single model sampled from the space of models parameterized by the model parameters 
0. If we assume Gaussian errors, we have shown that the cost function is the well-known 
sum-squared error criterion. The network is trained by performing gradient descent in the 
parameter space 9. The derivative of this cost function with respect to each weight in the 
network is calculated and the weights are adjusted to reduce the error. Because the cost 
function is non-convex, the optimization problem gets caught in local minima, thus making 
training and model optimization difficult. In order to reduce the effects of local minima, we 
performed bagging or Bootstrap AGgregation (Breiman 1996). In this procedure, we sample 
the data set V M times with replacement. For each sample, we build one neural network 
in the ensemble of $M$ neural networks. The final prediction is formed by taking the mean
prediction of all $M$ neural networks:

$$ \hat{y} = \frac{1}{M} \sum_{i=1}^{M} \hat{y}_i \qquad (10) $$

where $\hat{y}_i$ denotes the prediction of the $i$-th network.
Breiman (1996) showed that this procedure results in a regression model with lower error.
Our results, which we term our "ensemble model" (see Tables 4-6), show the effects of the
local minima and the distribution of errors that result from this problem on the SDSS,
2MASS, and GALEX data sets.
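The bagging procedure described above can be sketched as follows. This is only an illustration under stated assumptions: a least-squares line (`fit_line`) stands in for training one neural network, the data are synthetic, and all function names are hypothetical rather than taken from the authors' code.

```python
import random

def fit_line(sample):
    """Least-squares line y = a + b*x; a stand-in for training one network."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    sxy = sum((x - mx) * (y - my) for x, y in sample)
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def bagged_ensemble(data, n_models, fit, rng):
    """Fit n_models base learners, each on a bootstrap resample of data."""
    models = []
    for _ in range(n_models):
        # Draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit(sample))
    return models

def predict(models, x):
    """Ensemble prediction: the mean of the member predictions."""
    return sum(m(x) for m in models) / len(models)

rng = random.Random(0)
# Synthetic toy data: y = 0.1 + 0.5 x plus Gaussian noise (not survey data).
data = [(i / 50.0, 0.1 + 0.5 * (i / 50.0) + rng.gauss(0.0, 0.02))
        for i in range(50)]
models = bagged_ensemble(data, n_models=25, fit=fit_line, rng=rng)
y_hat = predict(models, 0.5)
```

Because `fit` is passed in as an argument, any training routine that returns a callable model could be substituted for the line fit, including a neural network trained from a random initialization.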
4.4. Kernel Methods and Gaussian Processes



In many ways, neural networks are attractive models for nonlinear regression problems
because they can scale to large data sets and provide a good baseline against which to compare
other methods. In the machine learning literature, kernel methods have in many ways
subsumed neural networks: it has been shown that, if the weights and biases of the neural
network are drawn from a Gaussian distribution (that is, if $P(\theta)$ is Gaussian), then as the
number of hidden units increases the prior distribution over functions implied by such
weights and biases converges to a Gaussian process (Neal 1996; Cristianini
& Shawe-Taylor 2000).

To describe a Gaussian process, we first note that in the case of a neural network, $y$ is
defined as a specific nonlinear function of $\mathbf{x}$, parametrized by $\theta$:
$y = f(\mathbf{x}, \theta)$. In a Gaussian process, we instead define a prior distribution over
the space of functions $f$, which is assumed to be Gaussian. Thus, we have:



The marginals for all subsets of variables of a Gaussian process are Gaussian. The covariance
matrix $\Sigma$ measures the degree of correlation between inputs $\mathbf{x}_i$ and $\mathbf{x}_j$. The choice of the
correlation function $\Sigma$ defines a potentially nonlinear relationship between the inputs and
the outputs. If we choose $\Sigma(\mathbf{x}_i, \mathbf{x}_j) = K(\mathbf{x}_i, \mathbf{x}_j)$, where $K$ is a positive definite function, we
obtain a specific Gaussian process induced by the kernel function $K$. To make a prediction
with a Gaussian process, we assume that a covariance function has been chosen, and then
compute:



We know that this distribution will be Gaussian, and the mean and variance of the distri- 
bution can be computed as follows (Cristianini & Shawe-Taylor 2000): 



where $\mathbf{k} = K(\mathbf{x}_i, \mathbf{x})$, $K = K(\mathbf{x}_i, \mathbf{x}_j)$, and $\lambda$ is an externally tuned parameter that represents
the noise in the output.
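For reference, the standard Gaussian process regression expressions that fit the definitions in this section can be sketched as follows. This is a reconstruction under stated assumptions rather than a transcription: it assumes a zero-mean prior, writes the noise parameter as $\lambda$ and adds it to the diagonal of the kernel matrix as a variance, and uses $X$ and $\mathbf{y}$ (our notation) for the training inputs and targets. The authors' displayed equations may differ in detail, for example in how the noise term is scaled.

```latex
\begin{align}
  \mathbf{f} &\sim \mathcal{N}(\mathbf{0}, \Sigma),
      \qquad \Sigma_{ij} = K(\mathbf{x}_i, \mathbf{x}_j) \\
  P(y \mid \mathbf{x}, X, \mathbf{y}) &=
      \mathcal{N}\!\left(m(\mathbf{x}),\, \sigma^{2}(\mathbf{x})\right) \\
  m(\mathbf{x}) &= \mathbf{k}^{T} (K + \lambda I)^{-1} \mathbf{y} \\
  \sigma^{2}(\mathbf{x}) &= K(\mathbf{x}, \mathbf{x})
      - \mathbf{k}^{T} (K + \lambda I)^{-1} \mathbf{k}
\end{align}
```

Here $\mathbf{k}$ is the vector of kernel values between the new input and each training input, and $K$ on the right-hand sides is the matrix of kernel values between training inputs, matching the usage in the text.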

The nonlinearity in the model comes from the choice of the kernel function $K$. Typical
choices for $K$ include the radial basis function,
$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{1}{2\sigma^2} \|\mathbf{x}_i - \mathbf{x}_j\|^2\right)$, or the
polynomial kernel, $K(\mathbf{x}_i, \mathbf{x}_j) = (1 + \mathbf{x}_i^T \mathbf{x}_j)^r$. We choose the latter for this study.






It can be shown that Gaussian process regression, as described above, builds a linear
model in a very high dimensional feature space that is induced by the nonlinear kernel
function $K$. One distinct advantage of the Gaussian process is that it delivers point predictions
as well as a confidence interval around the predictions.
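A small Gaussian process regression with a polynomial kernel can be sketched in pure Python as below. This is a hedged illustration, not the authors' code: the toy data, the kernel degree `r=2`, the `noise` value, and all function names are assumptions, and the noise is treated as a variance added to the diagonal of the kernel matrix.

```python
def poly_kernel(a, b, r=2):
    """Polynomial kernel K(a, b) = (1 + a . b)^r."""
    return (1.0 + sum(ai * bi for ai, bi in zip(a, b))) ** r

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(X, y, x_new, kernel, noise):
    """Return the GP predictive mean and variance at x_new."""
    n = len(X)
    # Kernel (Gram) matrix with the noise term added on the diagonal.
    A = [[kernel(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k = [kernel(xi, x_new) for xi in X]
    alpha = solve(A, y)   # (K + noise I)^{-1} y
    v = solve(A, k)       # (K + noise I)^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = kernel(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v))
    return mean, var

# Toy 1-D data drawn from y = x^2 (illustrative only, not survey data).
X = [[-1.0], [-0.5], [0.0], [0.5], [1.0]]
y = [1.0, 0.25, 0.0, 0.25, 1.0]
mean, var = gp_predict(X, y, [0.75], poly_kernel, noise=1e-3)
```

The two calls to `solve` are where the cubic cost in the number of training points arises; a production implementation would factor the matrix once (for example with a Cholesky decomposition) and reuse the factorization.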

5. DISCUSSION 

Results discussed below include the two SDSS photometric quality flag combinations,
called GOOD and GREAT. For the SDSS data, 10 different photometric pipeline
output parameters are utilized in different combinations (see § 3.6): the u, g, r, i, and z
extinction-corrected model magnitudes, the r band Petrosian 50% flux radius (petro50) and
Petrosian 90% flux radius (petro90), the Petrosian inverse concentration index (CI) derived
from these two quantities, the r band FracDev quantity (FD), and the r band Stokes value, all
as defined in § 3.5 and § 3.6. Results are also discussed from the combined catalogs of the
SDSS MGS (u, g, r, i, z magnitudes only) galaxies with redshifts, the 2MASS extended source
catalog (j, h, k_s magnitudes), and the GALEX All Sky Survey (nuv, fuv magnitudes) data sets.
The sample sizes for each of these data sets are given in Tables 4-6.

In order to make our results as comparable as possible, the same training, validation,
and testing sizes are used in our analysis for the ANNz, ensemble model, linear, and quadratic
fits: training=89%, validation=1%, and testing=10%. In order to put proper confidence
intervals on the error estimates from these methods, bootstrap resampling (Efron 1979;
Efron & Tibshirani 1993) is utilized on the training data: 90% of the training data are used
for each of 100 bootstraps.

For the Gaussian processes the situation is slightly different. The same percentages for
training, validation, and testing are utilized. However, for data sets 1-3, 1000 samples from
the training data are used for each of the bootstrap runs. For data set 4, only 50 samples
are utilized for each of the bootstrap runs. The Gaussian processes require matrix inversion,
which is an $O(N^3)$ operation. Hence small training sets were required to complete this
project in a reasonable time frame.
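To give a rough sense of why the training sets were kept small, the cubic scaling can be worked through numerically. The sketch below is a back-of-the-envelope estimate only: it ignores constant factors and memory limits, the function name is hypothetical, and it assumes the 89% training split is applied to the 202,297 objects quoted for data set 1 in Table 4.

```python
def relative_cubic_cost(n_large, n_small):
    """Ratio of O(N^3) costs for two training-set sizes (constants cancel)."""
    return (n_large / n_small) ** 3

n_objects = 202297                      # data set 1 size quoted in Table 4
n_train_full = round(0.89 * n_objects)  # 89% training split
ratio = relative_cubic_cost(n_train_full, 1000)
```

Under these assumptions, inverting the kernel matrix for the full training split would cost several million times more than for a 1000-point sample.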

In Tables 4-6 we report robust 90% confidence intervals around our 50% RMS result 
for all of these methods from the bootstrap resampling. Figures 3, 4 and 5 show the same 
information, albeit in a more detailed graphical format. 
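The bootstrap procedure behind these percentile bands can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, a least-squares line stands in for the regression model, each bootstrap draws 90% of the training set with replacement (the text does not say whether the draw is with replacement), and the nearest-rank percentile definition is our choice.

```python
import math
import random

def rms_error(z_true, z_pred):
    """Root-mean-square difference between true and predicted redshifts."""
    n = len(z_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(z_true, z_pred)) / n)

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def fit_line(sample):
    """Least-squares line z = a + b*x; a stand-in for any regression model."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    mz = sum(z for _, z in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    sxz = sum((x - mx) * (z - mz) for x, z in sample)
    b = sxz / sxx if sxx > 0 else 0.0
    a = mz - b * mx
    return lambda x: a + b * x

def bootstrap_rms(train, test, fit, n_boot, frac, rng):
    """Refit on n_boot resamples of the training data; return test RMS values."""
    size = int(frac * len(train))
    z_true = [z for _, z in test]
    errors = []
    for _ in range(n_boot):
        sample = [rng.choice(train) for _ in range(size)]
        model = fit(sample)
        errors.append(rms_error(z_true, [model(x) for x, _ in test]))
    return errors

rng = random.Random(1)
# Synthetic (feature, redshift) pairs with Gaussian scatter of 0.02.
data = []
for _ in range(1000):
    x = rng.random()
    data.append((x, 0.05 + 0.2 * x + rng.gauss(0.0, 0.02)))

n_train = round(0.89 * len(data))              # 89% training
n_valid = round(0.01 * len(data))              # 1% validation
train = data[:n_train]
valid = data[n_train:n_train + n_valid]        # unused by the line fit
test = data[n_train + n_valid:]                # remaining 10% for testing

errors = bootstrap_rms(train, test, fit_line, n_boot=100, frac=0.9, rng=rng)
p10, p50, p90 = (percentile(errors, q) for q in (10, 50, 90))
```

Reporting `p50` together with `p10` and `p90` mirrors the (50%), (10%), and (90%) columns of Tables 4-6.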

Table 4 and Figure 3 present our results on data set 1. The plots in Figure 3
clearly demonstrate that the ANNz and E-model neural network methods are superior in
their accuracy over nearly all bootstrap samples (labeled "model number" in Figures 3-5)
no matter which input quantities are used. The linear and quadratic fits fare the worst, as
is expected. The Gaussian process model is usually found in between. However, it must be
remembered that only ~1000 sample points are used for training in each case, and therefore
it is possible that it is not sampling all of the possible redshift-color space. Nonetheless it
does an excellent job given the small data samples used in comparison to the other methods.
It is also clear that the inputs used reproduce very similar results once one goes beyond the
five-band magnitudes of the SDSS and quantities like the Petrosian concentration index or
the Stokes measure of ellipticity are used. The best method, our ensemble model, regularly
reproduces RMS values of less than 0.019 no matter the confidence level (or bootstrap
sample) used.

Table 5 and Figure 4 for data set 2 give results very similar to those of data set 1 just
discussed. Lower RMS errors are obtained than those of the GOOD quality data, but there
is more variation in the confidence intervals, as evidenced by the increasing slope as a
function of bootstrap sample in Figure 4. As with data set 1, the RMS error results are lower
but similar when the five-band SDSS magnitudes are supplemented with quantities such as the
Petrosian radii or the FracDev measurement.

While data set 2 does on occasion have slightly better RMS errors than data set 1, in 
general there is little difference in the use of higher quality photometry and we would not 
recommend the use of the higher quality photometry of data set 2 as described herein in 
general. 

Table 6 and Figure 5 show the results of using data sets 3 and 4. Figure 5b for data set
4 (which has better photometric quality) again shows an increase in the variability of the
RMS error as a function of bootstrap sample, larger than that of the GOOD sample from
data set 3 in Figure 5a. In general, Figure 5b with the better SDSS photometry of data set
4 has RMS errors either the same as or worse than those from the SDSS-only data sets 1 and
2 in Figures 3 and 4. The variability in the RMS error as a function of bootstrap and the
generally large RMS errors lead one to believe that the sample size is too small to train
on. Given that there are only 326 objects in data set 4, this should not be too surprising.
The apparent ability of the quadratic regression to do so well might point to possible
over-fitting of the data.

However, in Figure 5a the story for data set 3 is very different. Here the variability is
much less a function of bootstrap, the RMS errors are generally quite low, and the prediction
abilities of the different methods are consistent with those observed in the SDSS data sets 1
and 2 found in Figures 3 and 4. The ensemble model once again surpasses all other methods
for 95% of the bootstrap samples, followed closely by the Gaussian process and ANNz
methods. Here one can see that the Gaussian process method is more competitive, as it is
likely to be sampling all possible templates of the 3095 input galaxies even with only 1000
samples per run.

In order to show the effects of sampling and local minima for the ensemble model on
the quality of redshift predictions, we trained a set of 100 neural networks and show their final
RMS errors in Figures 6 and 7. Each neural network is built by drawing a sample from the
training set with replacement and then performing the gradient descent optimization
described earlier. We train until the model converges, which is defined as the gradient-descent
iteration at which the magnitude of the gradient drops below a preset threshold.
Each such model corresponds to one point on the top panel of Figures 6 and 7.

The middle panel of Figure 7 shows the cumulative distribution function for the errors
shown in the top panel. The x-axis is the RMS error ($e_0$), and the y-axis is $P(\mathrm{RMS} < e_0)$.
The plot indicates that about 70% of the models we generated have an RMS error less than
0.1. This plot also indicates that reporting the minimum observed RMS value, which is done
throughout the literature on this topic (e.g., ANNz; Collister & Lahav 2004), is misleading.
For the models computed for this empirical cumulative distribution function, the quantity
$P(\mathrm{RMS} < e_0)$ rapidly vanishes as $e_0 \rightarrow 0.04$. This implies that such models are not only
highly unlikely, but also highly non-robust.
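The empirical cumulative distribution used in this argument is simple to compute. The sketch below uses ten hypothetical RMS values (not values from the paper) to show how the minimum can be unrepresentative of the ensemble; the function name is our own.

```python
def empirical_cdf(values, threshold):
    """Fraction of values strictly below threshold, i.e. P(RMS < e0)."""
    return sum(1 for v in values if v < threshold) / len(values)

# Hypothetical RMS errors for ten trained networks (illustrative only).
rms_errors = [0.031, 0.045, 0.052, 0.060, 0.071,
              0.083, 0.095, 0.120, 0.180, 0.240]

best = min(rms_errors)                            # the value often reported
frac_below_0p1 = empirical_cdf(rms_errors, 0.1)   # share of "typical" models
frac_below_0p04 = empirical_cdf(rms_errors, 0.04) # share near the minimum
```

In this toy sample only one model in ten falls below 0.04, so quoting `best` alone would overstate the accuracy a newly trained model is likely to achieve.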

In order to contrast this distribution with the empirical distributions observed on other 
data sets, we chose to show Figure 6. This figure, unlike the previous figure discussed, shows 
that the variation imposed by the optimization procedure, combined with the variations 
in the data set, have a relatively small effect on the quality of predictions: nearly 70% of 
the models have a very low error rate, with the distribution rapidly increasing after that. 
Note that the empirical cumulative distribution function shown in the middle panel rises
sharply at the onset of the curve. This indicates that 70% of the models have an error less 
than about 0.025. Again, this variation and apparent combined stability of the data set and 
optimization procedure would be entirely lost if only the minimum value of the distribution 
was reported. 

For comparison in Figure 8 one can see the known spectroscopic redshift plotted against 
the calculated photometric redshift from the test data for our five algorithms used against 
the ugriz-petro50-petro90-ci-qr GREAT data set (part of data set 2) as presented in Table 5. 
Note that the Gaussian process plot (bottom middle panel) has a larger number of points, 
which is due to the smaller training set and larger testing sets used in this algorithm. The 
plot in the bottom right-hand corner of Figure 8 contains the Gaussian process model results 
against the GREAT nuv-fuv-ugriz-jhk data set 4 as presented in Table 6. 
6. CONCLUSION

We have shown that photometric redshift accuracy of SDSS photometric data can be 
improved over that of previous attempts through a careful choice of additional photometric 
pipeline outputs that are related to angular size and morphology. Adding additional band- 
passes from the ultraviolet (GALEX) and infrared (2MASS) can be even more helpful, but 
the current sample sizes are too small to be useful for large-scale structure studies. 

We have also shown that there is little difference in the use of the higher quality SDSS 
photometry as defined herein. Hence we would not recommend its use because it decreases 
the sample size markedly and does not decrease the RMS errors in the photometric redshift 
prediction. 

We wish to stress that when using a neural network model for studies of photometric 
redshifts care must be taken when reporting the results of such models. There is a tendency 
in the astronomical literature to report only the best-fit model, which is often unlikely to 
be the one used to calculate the final photometric redshift estimates. 

The effects of local minima on prediction have also been discussed in some detail and 
we describe the way in which an ensemble of neural networks can reduce the problem. 

We have also discussed the result of using Gaussian processes for regression, which 
avoids many of the local minima problems that occur with neural networks. One of the 
great strengths of Gaussian processes as used herein is the ability to use small training 
sets, which may be helpful in high-redshift studies where very small numbers of measured 
redshifts are available. 

Finally, it should be noted that the TS methods described herein are only useful in 
a limited set of circumstances. In this work the SDSS MGS has been utilized since it is 
considered a complete photometric and spectroscopic survey in the sense that the magnitude 
limit of the survey is well understood, a broad range of colors are measured, and accurate 
redshifts obtained. It would be folly to attempt to use TS methods in a situation where 
these are poorly defined. For example, to simply apply TS methods to the entire SDSS 
galaxy photometric and redshift catalog without taking into account the limitations in the 
quantity and quality of photometry and redshifts would likely give one results that could not 
be quantified properly and give misleading conclusions. As well, it has been stressed that
TS methods have not been widely used in z>1 surveys because thus far a complete sample
of redshifts over the observed colors and magnitudes of the galaxies of interest has not been
measured. This will change as larger telescopes with more sensitive detectors appear, but
TS methods will not be useful for those situations where insufficient numbers of redshifts, 
colors and magnitudes exist to cover the required spaces. 
A. SDSS QUERIES

Below are the queries used against the SDSS DR2 and DR3 databases to obtain the 
data used throughout this paper. 

Query used to obtain data set 1: 
Select p.ObjID, p.ra, p.dec,
    p.dered_u, p.dered_g, p.dered_r, p.dered_i, p.dered_z,
    p.petroR50_r, p.petroR90_r, p.fracDeV_r, p.q_r,
    p.Err_u, p.Err_g, p.Err_r, p.Err_i, p.Err_z,
    p.petroR50Err_r, p.petroR90Err_r, p.qErr_r,
    s.z, s.zErr, s.zConf
into mydb.dr3cfracdpetq from SpecOBJall s, PhotoObjall p
WHERE s.specobjid=p.specobjid
    and s.zConf>0.95
    and (p.primtarget & 0x00000040 > 0)
    and ( ((flags & 0x8) = 0) and ((flags & 0x2) = 0) and ((flags & 0x40000) = 0) )

Query used to obtain data set 2: 
Select p.ObjID, p.ra, p.dec,
    p.dered_u, p.dered_g, p.dered_r, p.dered_i, p.dered_z,
    p.petroR50_r, p.petroR90_r, p.fracDeV_r, p.q_r,
    p.Err_u, p.Err_g, p.Err_r, p.Err_i, p.Err_z,
    p.petroR50Err_r, p.petroR90Err_r, p.qErr_r,
    s.z, s.zErr, s.zConf
into mydb.dr3cfracdpetq from SpecOBJall s, PhotoObjall p
WHERE s.specobjid=p.specobjid
    and s.zConf>0.95
    and (p.primtarget & 0x00000040 > 0)
    and ( ((flags & 0x8) = 0) and ((flags & 0x2) = 0) and ((flags & 0x40000) = 0)
    and ((flags & 0x10) = 0) and ((flags & 0x1000) = 0) and ((flags & 0x20000) = 0) )

Query used to obtain data set 3: 
Select p.objID, p.ra, p.dec,
    g.NUV_MAG, g.NUV_MAGERR, g.FUV_MAG, g.FUV_MAGERR,
    p.u, p.Err_u, p.g, p.Err_g, p.r, p.Err_r, p.i, p.Err_i, p.z, p.Err_z,
    t.j_m_k20fe, t.j_msig_k20fe, t.h_m_k20fe, t.h_msig_k20fe, t.k_m_k20fe, t.k_msig_k20fe,
    s.z, s.zErr, s.zConf
FROM TWOMASS.dbo.xsc t, BESTDR2.dbo.PhotoObjAll p, GALEXDRONE.dbo.nuvfuv g,
    BESTDR2.dbo.SpecOBJall s
WHERE s.specobjid=p.specobjid
    and s.zConf>0.95 and s.zWarning=0
    and g.NUV_MAG>-99 and g.FUV_MAG>-99
    and t.cc_flg='0'
    and (p.primtarget & 0x00000040 > 0)
    and ((flags & 0x8) = 0) and ((flags & 0x2) = 0) and ((flags & 0x40000) = 0)
    and p.objid=BESTDR2.dbo.fgetnearestobjideq(t.ra,t.dec,0.08333)
    and p.objid=BESTDR2.dbo.fgetnearestobjideq(g.RA,g.DEC,0.08333)



Query used to obtain data set 4: 
Select p.objID, p.ra, p.dec,
    g.NUV_MAG, g.NUV_MAGERR, g.FUV_MAG, g.FUV_MAGERR,
    p.u, p.Err_u, p.g, p.Err_g, p.r, p.Err_r, p.i, p.Err_i, p.z, p.Err_z,
    t.j_m_k20fe, t.j_msig_k20fe, t.h_m_k20fe, t.h_msig_k20fe, t.k_m_k20fe, t.k_msig_k20fe,
    s.z, s.zErr, s.zConf
FROM TWOMASS.dbo.xsc t, BESTDR2.dbo.PhotoObjAll p, GALEXDRONE.dbo.nuvfuv g,
    BESTDR2.dbo.SpecOBJall s
WHERE s.specobjid=p.specobjid
    and s.zConf>0.95 and s.zWarning=0
    and g.NUV_MAG>-99 and g.FUV_MAG>-99
    and t.cc_flg='0'
    and (p.primtarget & 0x00000040 > 0)
    and ( ((flags & 0x8) = 0) and ((flags & 0x2) = 0) and ((flags & 0x40000) = 0)
    and ((flags & 0x10) = 0) and ((flags & 0x1000) = 0) and ((flags & 0x20000) = 0) )
    and p.objid=BESTDR2.dbo.fgetnearestobjideq(t.ra,t.dec,0.08333)
    and p.objid=BESTDR2.dbo.fgetnearestobjideq(g.RA,g.DEC,0.08333)
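The bit tests in the queries above can be mirrored in Python to check individual catalog rows. This is a sketch under assumptions: the mask values are taken from the queries, but the flag names attached to them follow our reading of the SDSS photometric flag documentation and should be verified there, and the function names are hypothetical.

```python
# Bit masks appearing in the queries. The names are assumptions based on the
# SDSS photometric flag documentation and should be checked against it.
BRIGHT = 0x2
BLENDED = 0x8
CHILD = 0x10
COSMIC_RAY = 0x1000
INTERP = 0x20000
SATURATED = 0x40000
TARGET_GALAXY = 0x40   # primtarget bit tested in every query

# Data sets 1 and 3 reject three flags; data sets 2 and 4 reject six.
GOOD_MASK = BLENDED | BRIGHT | SATURATED
GREAT_MASK = GOOD_MASK | CHILD | COSMIC_RAY | INTERP

def is_good(flags):
    """True if none of the three GOOD rejection bits is set."""
    return (flags & GOOD_MASK) == 0

def is_great(flags):
    """True if none of the six GREAT rejection bits is set."""
    return (flags & GREAT_MASK) == 0

def is_main_galaxy_target(primtarget):
    """True if the galaxy target bit is set in primtarget."""
    return (primtarget & TARGET_GALAXY) > 0
```

For example, an object whose only set bit is `0x10` passes `is_good` but fails `is_great`, which is one reason the GREAT samples are so much smaller than the GOOD ones.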
Fig. 3. — (a) The top six plots contain the five training-set methods for each of the six inputs
applied to the SDSS GOOD data sets known as data set 1. (b) The bottom five plots are our
training-set results for each of the five training methods applied to the six different SDSS
GOOD inputs.

Fig. 4. — (a) The top six plots contain the five training methods for each of the six inputs
applied to the SDSS GREAT data sets known as data set 2. (b) The bottom five plots
are our training-set results for each of the five training methods applied to the six different
SDSS GREAT inputs.



[Figure 5: two panels titled "Dataset 3" and "Dataset 4"; x-axis "Model Number" (0 to 100);
y-axis RMS error (approximately 0.015 to 0.05); legend: Linear, Quadratic, ANNZ, E-Model, GP.]



Fig. 5. — (a) The top plot shows the five training methods applied to data set 3. (b) The 
bottom plot shows the five training methods applied to data set 4. 
Fig. 6. — The top panel of this figure shows the distribution of errors for 100 neural networks
on the GREAT E-Model ugriz-petro50-petro90-qr-fracdev, data set 2 (see Table 5). The
middle panel shows the empirical cumulative distribution function for the RMS errors for
the 100 models shown in the top panel. The bottom panel shows the probability distribution
function of the RMS error. See § 5 for more details.

Fig. 7. — The top panel of this figure shows the distribution of errors for 100 neural networks
on the GREAT E-Model nuv-fuv-ugriz-jhk, data set 4 (see Table 6). The middle panel shows
the empirical cumulative distribution function for the RMS errors for the 100 models shown
in the top panel. The bottom panel shows the probability distribution function of the RMS
error. See § 5 for more details.

Fig. 8. — Spectroscopic redshift is plotted versus calculated photometric redshift for the
GREAT ugriz-petro50-petro90-ci-qr data set 2 with 5 algorithms while the 6th plot uses the
Gaussian process model for the nuv-fuv-ugriz-jhk GREAT data set 4. See Table 4 for details.
Table 3: Different Photometric Redshift Techniques and Accuracies.

Method Name                σ_RMS         Data set¹        Inputs²        Source
--------------------------------------------------------------------------------------------------
CWW                        0.0666        SDSS-EDR         ugriz          Csabai et al. (2003)
Bruzual-Charlot            0.0552        SDSS-EDR         ugriz          Csabai et al. (2003)
ClassX                     0.0340        SDSS-DR2         ugriz          Suchkov et al. (2005)
Polynomial                 0.0318        SDSS-EDR         ugriz          Csabai et al. (2003)
Support Vector Machine     0.0270        SDSS-DR2         ugriz          Wadadekar (2005)
Kd-tree                    0.0254        SDSS-EDR         ugriz          Csabai et al. (2003)
Support Vector Machine     0.0230        SDSS-DR2         ugriz+r50+r90  Wadadekar (2005)
Artificial Neural Network  0.0229        SDSS-DR1         ugriz          Collister & Lahav (2004)
Artificial Neural Network  0.022-0.024   SDSS-DR1         A              Vanzella et al. (2004)
Artificial Neural Network  0.0200-0.025  SDSS-EDR         B              Tagliaferri et al. (2003)
Artificial Neural Network  0.0200-0.026  SDSS-EDR         C              Ball et al. (2004)
Polynomial                 0.025         SDSS-DR1,GALEX   ugriz+nuv      Budavari (2005)

¹ SDSS-EDR Early Data Release (Stoughton et al. 2002), SDSS-DR1 Data Release 1 (Abazajian et al. 2003),
SDSS-DR2 Data Release 2 (Abazajian et al. 2004)
² ugriz=5 SDSS magnitudes, r50=Petrosian 50% light radius in r band, r90=Petrosian 90% light radius in
r band, nuv=Near-Ultraviolet GALEX bandpass. For A see Vanzella et al. (2004), for B see Tagliaferri
et al. (2003) and for C see Ball et al. (2004) for a list of the large variety of inputs used in each of these
publications.



Table 4. Photometric Redshift prediction RMS errors with confidence levels for Dataset 1,
202,297 objects

                      Linear                    Quadratic                 ANNz                      E-Model                   GP
Input-parameters¹     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)
------------------------------------------------------------------------------------------------------------------------------------------------------
ugriz                 0.0283  0.0282  0.0284    0.0255  0.0255  0.0255    0.0206  0.0205  0.0208    0.0201  0.0198  0.0205    0.0227  0.0225  0.0230
ugriz+r50+r90         0.0288  0.0288  0.0289    0.0245  0.0244  0.0245    0.0194  0.0192  0.0196    0.0189  0.0187  0.0194    0.0236  0.0233  0.0241
ugriz+r50+r90+CI      0.0286  0.0285  0.0286    0.0264  0.0263  0.0265    0.0194  0.0191  0.0195    0.0187  0.0185  0.0190    0.0239  0.0236  0.0243
ugriz+r50+r90+CI+QR   0.0296  0.0295  0.0296    0.0245  0.0244  0.0246    0.0192  0.0189  0.0194    0.0186  0.0184  0.0190    0.0241  0.0238  0.0245
ugriz+r50+r90+FD      0.0286  0.0286  0.0287    0.0263  0.0261  0.0266    0.0189  0.0188  0.0192    0.0183  0.0181  0.0187    0.0236  0.0233  0.0241
ugriz+r50+r90+FD+QR   0.0290  0.0289  0.0290    0.0243  0.0242  0.0243    0.0189  0.0187  0.0191    0.0185  0.0183  0.0186    0.0239  0.0235  0.0242

¹ ugriz=5 SDSS magnitudes, r50=Petrosian 50% light radius in r band, r90=Petrosian 90% light radius in r band, CI=Petrosian Inverse Concentration Index, FD=FracDeV
value, QR=Stokes value. See § 3.6 for more details.
Table 5. Photometric Redshift prediction RMS errors with confidence levels for Dataset 2,
33,328 objects

                      Linear                    Quadratic                 ANNz                      E-Model                   GP
Input-parameters¹     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)
------------------------------------------------------------------------------------------------------------------------------------------------------
ugriz                 0.0242  0.0241  0.0242    0.0225  0.0225  0.0225    0.0208  0.0207  0.0209    0.0197  0.0194  0.0200    0.0243  0.0237  0.0248
ugriz+r50+r90         0.0227  0.0227  0.0227    0.0217  0.0216  0.0217    0.0201  0.0199  0.0202    0.0194  0.0192  0.0198    0.0237  0.0232  0.0241
ugriz+r50+r90+CI      0.0240  0.0240  0.0240    0.0226  0.0226  0.0226    0.0200  0.0199  0.0202    0.0192  0.0191  0.0194    0.0242  0.0238  0.0247
ugriz+r50+r90+CI+QR   0.0235  0.0235  0.0235    0.0213  0.0213  0.0213    0.0197  0.0195  0.0198    0.0185  0.0183  0.0189    0.0243  0.0237  0.0255
ugriz+r50+r90+FD      0.0243  0.0243  0.0243    0.0220  0.0219  0.0220    0.0196  0.0195  0.0198    0.0185  0.0183  0.0189    0.0230  0.0226  0.0233
ugriz+r50+r90+FD+QR   0.0234  0.0233  0.0234    0.0220  0.0219  0.0220    0.0194  0.0193  0.0196    0.0185  0.0184  0.0188    0.0242  0.0238  0.0245

¹ ugriz=5 SDSS magnitudes, r50=Petrosian 50% light radius in r band, r90=Petrosian 90% light radius in r band, CI=Petrosian Inverse Concentration Index, FD=FracDeV
value, QR=Stokes value. See § 3.6 for more details.
Table 6. Photometric Redshift prediction RMS errors with confidence levels for Datasets
3 and 4.

                      Linear                    Quadratic                 ANNz                      E-Model                   GP
Input-parameters¹     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)     (50%)   (10%)   (90%)
------------------------------------------------------------------------------------------------------------------------------------------------------
nuv+fuv+ugriz+jhk²    0.0201  0.0200  0.0201    0.0200  0.0199  0.0202    0.0191  0.0188  0.0194    0.0171  0.0161  0.0195    0.0195  0.0189  0.0203
nuv+fuv+ugriz+jhk³    0.0254  0.0249  0.0259    0.0220  0.0214  0.0229    0.0209  0.0204  0.0222    0.0369  0.0296  0.0475    0.0267  0.0249  0.0291

¹ ugriz=5 SDSS magnitudes, nuv=GALEX NUV magnitude, fuv=GALEX FUV magnitude, jhk=2MASS jhk magnitudes. See § 3.6 for more details.
² Dataset 3: 3095 combined catalog objects
³ Dataset 4: 326 combined catalog objects